Towards a Semantic Annotation of English Television News - Building and Evaluating a Constraint Grammar FrameNet

Authors

  • Shaohua Yang
  • Hai Zhao
  • Bao-Liang Lu
Abstract

This paper introduces a new approach to the Chinese Pinyin-to-character (PTC) conversion problem. Converting Chinese Pinyin to Chinese characters can be regarded as a transformation between two different languages (from the Latin writing system of Pinyin to the character form of Chinese, Hanzi), which can be solved naturally within a machine translation framework. The PTC problem is usually treated as a sequence labeling problem; however, it is more difficult than general sequence labeling problems because it requires a large label set covering all Chinese characters. The essential difficulty of the task lies in the high degree of ambiguity in mapping Pinyin to Chinese characters. Our approach is novel in that it effectively combines features of the continuous source sequence and the target sequence. Experimental results show that the proposed approach is much faster and also outperforms existing sequence labeling approaches.

Related resources

Towards a Semantic Annotation of English Television News - Building and Evaluating a Constraint Grammar FrameNet

This paper presents work on the semantic annotation of a multimodal corpus of English television news. The annotation is performed on the second-by-second aligned transcript layer, adding verb frame categories and semantic roles on top of a morphosyntactic analysis with full dependency information. We use a rule-based method, where Constraint Grammar mapping rules are automatically generated from...

The FrameNet Constructicon

The Berkeley FrameNet Project has been engaged since 1997 in discovering and describing the semantic and distributional properties of words in the general vocabulary of English. Notions from FRAME SEMANTICS (see Fillmore and Baker 2009 and references therein) provide the basis of the semantic description of the lexical units in the database, and sentences extracted from the FrameNet (FN) text...

Graph Methods for Multilingual FrameNets

This paper introduces a new, graph-based view of the data of the FrameNet project, which we hope will make it easier to understand the mixture of semantic and syntactic information contained in FrameNet annotation. We show how English FrameNet and other Frame Semantic resources can be represented as sets of interconnected graphs of frames, frame elements, semantic types, and annotated instances ...

The Impact of Grammar Enhancement on Semantic Resources Induction

This paper describes the effects of the evolution of an Italian dependency grammar on a task of multilingual FrameNet acquisition. The task is based on the creation of virtual English/Italian parallel annotation corpora, which are then aligned at the dependency level using two manually encoded grammar-based dependency parsers. We show how the evolution of the LAS (Labeled Attachment Score) me...

Frame Information Transfer from English to Italian

We describe an automatic projection algorithm for transferring frame-semantic information from English to Italian texts, as a first step towards the creation of an Italian FrameNet. Projection of frame-semantic information from English to other European languages has already been investigated for German, Swedish and French. With our work, we point out typical features of the Italian language as reg...

Publication date: 2012